-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added ability to use regex rules #133
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## main #133 +/- ##
==========================================
+ Coverage 95.90% 95.92% +0.01%
==========================================
Files 21 21
Lines 708 711 +3
==========================================
+ Hits 679 682 +3
Misses 18 18
Partials 11 11
Continue to review full report at Codecov.
|
I've been wanting this feature, but the schema I had in mind was something like this: rules:
- name: foo
terms:
- foo
- Foo
- name: bar
regex: "(?i)bar"
# when both are defined, `terms` wins, `regex` would be ignored, maybe log debug message that both are defined?
- name: baz
terms:
- baz
- BaZ
regex: "(?i)baz" |
docs/rules.md
Outdated
|
||
* If `true`, terms will be evaluated as regular expressions | ||
* If `false`, terms will be treated as plain-text values | ||
* **NOTE** this is an advanced feature. Rules will be skipped if they do not compile. Only use non-capturing groups in patterns. Look-around assertions are not supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a fatal error should happen if a supplied regex can't compile. Otherwise the user may never know that their rule isn't being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - I added a Error log message, so when it runs if a rule won't compile, an error is generated but it continues, it looks like this:
2021-08-13T09:03:27-05:00 ERR Unable to compile regular expression, disabling rule error="error parsing regexp: missing closing ): `(?i)(black(\\s*list)`" Rule=rulename
I thought about letting it do a fatal error instead, though I chose this approach in anticipation of the remote ruleset feature - my thought was that if there is a rule that doesn't work in a remote ruleset that the user can't necessarily directly change, it could continue on after printing the error. That said, I'm totally happy to go either way with this - if you want it a fatal error that's easy to change, I can just change it back to a MustCompile instead, or doing a panic after logging the error so that they still get that debug info from that log. What do you prefer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my thought was that if there is a rule that doesn't work in a remote ruleset that the user can't necessarily directly change
I think we should tackle this when it becomes a problem or concern.
No problem, should be pretty straightforward. I'll change this. |
Oh - one more question. I see the (?i) in your example. I assume that means that you are intending that this wouldn't do the wrapping at all that it does for terms, so word_boundary, word_boundary_start, and word_boundary_end would also be ignored? That makes sense, but in that case should that also then put a debug message if any of those are defined just to make it clear what is happening? |
That's correct, if you supply |
Sorry for the delay - updated the code to follow the suggested changes above. |
GitHub Action run 1, commit: 1fe097d
I see that the benchmark tool is failing. I'm not sure why that would be, however, unless there is some instability in the benchmarking process itself. This change should not represent any runtime change in performance, and only a single additional conditional when compiling rules. If I'm missing something there though please let me know so that I can help resolve. |
Apologies for not responding for so long! The benchmark test is failing because it's supposed to create a comment on the PR with the results, but the github action that handles the PR commenting doesn't work on forks. I'm disabling that part in #147. Aside from that, looking at the output of the benchmark tool (which just runs |
@cognitivegears Please let me know if there are any updates. This is a nice feature. |
@caitlinelfring we're using this quite extensively at Canonical, and really need this feature. anything i could assist you with to get this through? |
Please check if the PR fulfills these requirements
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
New feature
What is the current behavior? (You can also link to an open issue here)
Currently there is no way to specify a regular expression in a rule. This means that some rules are difficult to create, for example:
What is the new behavior (if this is a feature change)?
Added a new option "regex_terms" on the rules to specify that they should be treated as regular expressions. If set to yes, then terms are not escaped before adding to the terms list and errors displayed / rules set to ignore from the regular expression compile on error.
Does this PR introduce a breaking change? (What changes might users need to make due to this PR?)
No
Other information:
N/A